STANFORD  ARTIFICIAL  INTELLIGENCE  PROJECT 
MEMO AIM-114


APRIL 1970


ON  THE  SYNTHESIS  OF  FINITE-STATE  ACCEPTORS 


by 

A.  W.  Biermann  and  J.  A.  Feldman 
Computer  Science  Department 
Stanford  University 


ABSTRACT: Two algorithms are presented for solving the following
problem: Given a finite set S of strings of symbols,
find a finite-state machine which will accept the strings
of S and possibly some additional strings which
"resemble" those of S. The approach used is to
construct the states and transitions of the
acceptor machine directly from the string information. The
algorithms include a parameter which enables one to
increase the exactness of the resulting machine's
behavior as much as desired by increasing the number of
states in the machine. The properties of the algorithms
are presented and illustrated with a number of examples.

The  paper  gives  a  method  for  identifying  a  finite-state 
language  from  a  randomly  chosen  finite  subset  of  the 
language  if  the  subset  is  large  enough  and  if  a  bound 
is  known  on  the  number  of  states  required  to  recognize 
the  language.  Finally,  we  discuss  some  of  the  uses  of  the 
algorithms  and  their  relationship  to  the  problem  of 
grammatical inference.

1.  Introduction 

An acceptor is a finite-state machine which receives strings of
symbols as input and which responds to each string with an answer of
either "yes" or "no"; that is, it accepts or rejects each string. This
paper discusses the problem of constructing an acceptor for a particular
finite set S of strings and perhaps some additional strings which
"resemble" those in S. We present two algorithms for constructing
such a machine from S and from additional information about the
required precision of the machine's behavior. The algorithms
enable one to obtain varying degrees of accuracy with
correspondingly varying degrees of machine complexity. Thus, if the
acceptor is required to accept only the strings of S and no others,
it can be expected to require many more states than if a large number of
"extra" strings are allowed to be in the accepted set.

There are a number of finite-state machine synthesis algorithms in
the literature. Huffman [9], Mealy [10], and others have developed
algorithms for sequential machine design when some kind of transition
table or state diagram is given. Ott and Feinstein [11], Brzozowski [1],
and others have given methods for constructing acceptors from their
regular expressions. This paper is concerned with the problem of
designing a finite-state acceptor when no simple transition table,
state diagram, or regular expression is available.

Ginsburg [5, 6] gives an algorithm for synthesizing sequential
machines from input-output behavior, a problem similar to the one
dealt with here. However, our algorithms are concerned with the design
of a different type of device, an acceptor, and the methods presented
are distinctly different from those of [5, 6].

The  techniques  which  are  described  here  grew  out  of  an  idea  by 
Feldman  [2]  who  was  attempting  to  infer  finite-state  grammars  for  sets 
of  strings.  Feldman’s  idea  suggested  a  method  of  creating  states  and 
transitions  from  string  information,  and  this  concept  became  the  core 
of  the  algorithms  which  were  subsequently  developed. 

In this paper we will show how to construct a machine A(S,k)
which is an acceptor of the set S (Section 2). Other properties of
A(S,k) will be investigated, with examples given (Section 3), and its
applications will be discussed (Section 4). Finally, a second algorithm
and its properties will be investigated (Section 5).

2.  A  Finite-State  Acceptor 

We introduce a number of definitions, largely following the notation
of Ginsburg [6].

Definition 2.1. A nondeterministic automaton A is a five-tuple
<Q, Σ, f, Q_0, F> where

Q is a finite nonempty set (of states)

Σ is a finite nonempty set (of input symbols)

f is a mapping Q × Σ → 2^Q (the transition function)

Q_0 is a subset of Q (the set of initial states)

F is a subset of Q (the set of final states).

The function f is extended to a mapping Q × Σ* → 2^Q by the
recursive definition

f(q,Λ) = {q}

where Λ is the string of length zero and q ∈ Q, and

f(q,wa) = ∪ f(q',a), the union taken over all q' ∈ f(q,w),

where w ∈ Σ* and a ∈ Σ.

Definition 2.2. The language L(A) of the nondeterministic
automaton A will be defined to be the set of strings w = a_1 a_2 ... a_j,
a_i ∈ Σ for 1 ≤ i ≤ j, such that there is a sequence of states
q_0, q_1, ..., q_j with the properties

(1) q_0 ∈ Q_0

(2) q_i ∈ f(q_{i-1}, a_i) for 1 ≤ i ≤ j

(3) q_j ∈ F.

We will be interested in the relation of the languages of various
automata to a fixed set S of strings. If S ⊆ L(A), then A will
be said to accept S. If S = L(A), then A will be said to accept
exactly S.

After a preliminary definition, we will show how to construct a
class of automata which will be shown to accept S.

Definition 2.3. The k-tail of z with respect to S ⊆ Σ* will
be denoted as g(z,S,k) and will be defined as follows: Let z ∈ Σ*
be such that zw ∈ S for some w ∈ Σ*, and let k be a nonnegative
integer. Then g(z,S,k) is defined as the set of strings w ∈ Σ*
with the properties

(a) zw ∈ S

(b) length(w) ≤ k.

g(z,S,k) is undefined if z and k are outside of the domains
specified.
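
The k-tail is easy to compute from a finite sample. The following is a minimal Python sketch and not part of the original memo: strings are Python str values, the empty string "" stands for Λ, and the name k_tail is our own choice. It returns None where the definition leaves g(z,S,k) undefined.

```python
def k_tail(z, S, k):
    """Return g(z, S, k) as a frozenset, or None when it is undefined."""
    if k < 0:
        return None
    # Every w with zw in S; z must be the head of at least one string of S.
    continuations = [s[len(z):] for s in S if s.startswith(z)]
    if not continuations:
        return None
    # Keep only the continuations of length k or less.
    return frozenset(w for w in continuations if len(w) <= k)
```

For the sample S = {a, ab, abb}, the call k_tail("a", S, 1) gives the set {Λ, b}, while k_tail("b", S, 1) is undefined because no string of S begins with b.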

The acceptor of set S will be determined from the set S and
from the look-ahead level k and will be denoted A(S,k).

Definition 2.4. If S is a finite set of strings from Σ*, let
A(S,k) be the nondeterministic automaton

A(S,k) = <Q, Σ, f, Q_0, F>

where

Q = {q ∈ 2^(Σ*) | g(z,S,k) = q for some z ∈ Σ*}

Σ = a finite nonempty set of input symbols

f(q,a) = {q' ∈ Q | there is a z ∈ Σ* such that g(z,S,k) = q
and g(za,S,k) = q'}

Q_0 = {g(Λ,S,k)}

F = {q ∈ Q | Λ ∈ q}.


The machine A(S,k) thus has as states the set of all k-tails
which can be constructed from S. A transition from k-tail S_1 to
k-tail S_2 under input symbol b will occur if there is a string z
with k-tail S_1 such that zb has k-tail S_2.

The set S will be said to yield the language L(A(S,k))
(at look-ahead level k).

Example 2.1. As an illustration of Definition 2.4,
consider the case where S = {a, ab, abb} and k = 1. Then
A(S,k) = <Q, Σ, f, Q_0, F> where

Q = {{a}, {Λ,b}, {Λ}}

Σ = {a,b}

f({a},a) = {{Λ,b}}
f({Λ,b},b) = {{Λ}, {Λ,b}}

Q_0 = {{a}}

F = {{Λ}, {Λ,b}}

[Figure 1. State diagram of A(S,1) for Example 2.1, with final states marked.]

Note that the resulting machine, which is diagramed in Figure 1, recognizes
the set ab*. A(S,k) is typically not in minimal form, as is the case
here, but minimization can always be done by well known algorithms
[4, 5, 6, 8]. If k had been set at 2 or larger, we would obtain
L(A(S,k)) = S.
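
The construction of A(S,k) can be sketched in a few lines of Python. This is an illustration under our own assumptions, not the authors' program: build_acceptor and accepts are hypothetical names, "" stands for Λ, states are frozensets of strings, S is assumed nonempty, and only heads of strings in S are examined because the k-tail is undefined elsewhere.

```python
def k_tail(z, S, k):
    """g(z, S, k) for a head z of some string in S."""
    return frozenset(s[len(z):] for s in S
                     if s.startswith(z) and len(s) - len(z) <= k)


def build_acceptor(S, k):
    """Return (initial states, transition table, final states) of A(S, k)."""
    heads = {s[:i] for s in S for i in range(len(s) + 1)}
    tail = {z: k_tail(z, S, k) for z in heads}
    transitions = {}
    for z in heads:
        if z:  # z = z'a contributes a transition from g(z') to g(z'a) under a
            key = (tail[z[:-1]], z[-1])
            transitions.setdefault(key, set()).add(tail[z])
    initial = {tail[""]}
    final = {q for q in tail.values() if "" in q}
    return initial, transitions, final


def accepts(machine, w):
    """Nondeterministic simulation: track every state reachable on w."""
    initial, transitions, final = machine
    current = set(initial)
    for a in w:
        current = {q2 for q in current for q2 in transitions.get((q, a), ())}
    return bool(current & final)


sample = {"a", "ab", "abb"}
machine_k1 = build_acceptor(sample, 1)   # recognizes ab*
machine_k2 = build_acceptor(sample, 2)   # recognizes exactly the sample
```

With k = 1 the sketch accepts every string of the form ab...b, and with k = 2 it rejects abbb, in agreement with the example.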

Theorem 2.1. S ⊆ L(A(S,k)) for all nonnegative k.

Proof. If w = a_1 a_2 a_3 ... a_j ∈ S, a_i ∈ Σ, then let q_0 = g(Λ,S,k)
and q_i = g(a_1 a_2 ... a_i,S,k) for i = 1,2,3,...,j. The sequence of
states q_0, q_1, q_2, ..., q_j satisfies the three properties of Definition 2.2,
so we have w ∈ L(A(S,k)). This completes the proof.

The construction of Definition 2.4 has provided states to
account for all possible k-continuations of heads of strings in S,
so A(S,k) will surely be able to accept S. The next section
is concerned with stronger requirements on A(S,k) and the
consequences of varying the look-ahead level k.


3. Further Properties of A(S,k)

The machine A(S,k) will have no more than 2^(m^0 + m^1 + ... + m^k) states if S
is from an alphabet Σ of m distinct symbols, since each state is a set of
strings of length k or less. The upper bound
on a machine's size can therefore be adjusted by setting the value of k. Thus
we can expect A(S,k) to increase greatly in "computing power" as k
is made larger. For example, if L(A(S,k)) is considered to be an
approximation to S, we can expect the approximation to be much better
if k is larger and, in fact, S = L(A(S,k)) if k is as large as the
length of the longest string in S, as will be shown below. From
another point of view, we can consider L(A(S,k)) to be a guess of the
language L_0 from which the sample S has been chosen. If k is
very small, the "guess" of L_0 will constitute a very liberal inference
and may include most of the strings from the alphabet of S. If k
is as large as the longest string in S, however, the inference will
be very conservative and will, in fact, include only the strings of S.

All of this will be made precise in the paragraphs that follow.

The first property to confirm is that L(A(S,k)) = S if k is
large enough, and the proof will make use of the following obvious
Lemma.

Lemma 3.1. Let h(z,S) = {w ∈ Σ* | zw ∈ S}. Then if k is greater
than or equal to the length of the longest string in S, g(z,S,k) = h(z,S)
for all z ∈ Σ* such that g(z,S,k) is defined.

Theorem 3.1. L(A(S,k)) = S if k is greater than or equal to
the length of the longest string in S.

Proof. Employing the Lemma and the definition of A(S,k), we
have

A(S,k) = <Q = {q ∈ 2^(Σ*) | q = h(z,S)}, Σ, f, Q_0 = {h(Λ,S)}, F = {h(z,S) ∈ Q | Λ ∈ h(z,S)}>

where f(q,a) = {q' ∈ Q | there is z ∈ Σ* such that q = h(z,S) and q' = h(za,S)}
or f(h(z,S),a) = {h(za,S)} if h(za,S) is not empty. It follows that
f(h(z,S),w) = {h(zw,S)} for all w ∈ Σ* if h(zw,S) is not empty. Then
f(h(Λ,S),w) = {h(w,S)}. But h(Λ,S) is the initial state, so w will
be accepted by A(S,k) if and only if h(w,S) is a final state. That
is true if and only if Λ ∈ h(w,S), which holds if and only if w ∈ S.
We conclude that w ∈ L(A(S,k)) if and only if w ∈ S, which completes
the proof.


The lower bound given on k is in general the best bound which
can be obtained, as can be seen by applying the construction of Definition
2.4 to the set S = {Λ, a, a^2, ..., a^n}. It may also be worth mentioning
that the machine of Theorem 3.1 will be deterministic and in its minimal
form.

We will next investigate the languages L(A(S,k)) as k is
varied and note their relationship to each other. It turns out that
L(A(S,k)) "covers" L(A(S,k+i)) in the sense of Reynolds [12] for
nonnegative i. In fact, the next theorem could be derived from
Reynolds' results.

Theorem 3.2. L(A(S,k+1)) ⊆ L(A(S,k)).

Proof. Assume w = a_1 a_2 ... a_j ∈ L(A(S,k+1)), a_i ∈ Σ for 1 ≤ i ≤ j.
Then there is a sequence of states q_0, q_1, ..., q_j in A(S,k+1) with
the properties

(1) q_0 ∈ Q_0

(2) q_{i+1} ∈ f(q_i, a_{i+1}) for 0 ≤ i ≤ j-1

and

(3) Λ ∈ q_j.

Furthermore, there is a string z_i ∈ Σ* such that z_i w_i ∈ S for some
w_i ∈ Σ*, where q_i = g(z_i,S,k+1) and q_{i+1} = g(z_i a_{i+1},S,k+1) for each
i = 0,1,2,...,j-1. Next consider A(S,k) and the sequence of states
q'_0, q'_1, ..., q'_j where q'_i = q_i - {all strings in q_i of length k+1}.
These states certainly exist in A(S,k) and have properties analogous
to (1), (2), and (3) above. To justify (2), for example, simply employ
the z_i's defined above and observe that q'_i = g(z_i,S,k) and
q'_{i+1} = g(z_i a_{i+1},S,k) will occur, so that q'_{i+1} ∈ f(q'_i, a_{i+1}) for
0 ≤ i ≤ j-1. Therefore w ∈ L(A(S,k)) and the proof is complete.
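
The nesting of the languages L(A(S,k)) can be checked experimentally on a small sample. The Python sketch below is ours and rests on stated assumptions: "" stands for Λ, the helper names are hypothetical, and each language is compared only on strings of length 6 or less over the alphabet {a,b}, so it illustrates the theorems rather than proving them.

```python
from itertools import product


def build_acceptor(S, k):
    """Return (initial states, transition table, final states) of A(S, k)."""
    heads = {s[:i] for s in S for i in range(len(s) + 1)}
    tail = {z: frozenset(s[len(z):] for s in S
                         if s.startswith(z) and len(s) - len(z) <= k)
            for z in heads}
    transitions = {}
    for z in heads:
        if z:
            transitions.setdefault((tail[z[:-1]], z[-1]), set()).add(tail[z])
    return {tail[""]}, transitions, {q for q in tail.values() if "" in q}


def accepts(machine, w):
    initial, transitions, final = machine
    current = set(initial)
    for a in w:
        current = {q2 for q in current for q2 in transitions.get((q, a), ())}
    return bool(current & final)


def language_up_to(S, k, alphabet, max_len):
    """All strings of length max_len or less that A(S, k) accepts."""
    machine = build_acceptor(S, k)
    words = ("".join(t) for n in range(max_len + 1)
             for t in product(alphabet, repeat=n))
    return {w for w in words if accepts(machine, w)}


sample = {"a", "ab", "abb"}
# languages[k] approximates L(A(sample, k)) for k = 0, 1, 2, 3, 4.
languages = [language_up_to(sample, k, "ab", 6) for k in range(5)]
```

Each entry of languages contains the sample, each is contained in its predecessor, and the entry for k = 3 (the length of the longest string) is the sample itself.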

The last three theorems combine to give a good picture of how S
is related to L(A(S,k)) as k is varied, and Figure 2 illustrates
the situation. Each language L(A(S,k)) will include S and will be
included by L(A(S,k-1)). k is thus a parameter of the algorithm
which can be used to adjust L(A(S,k)) to be as close to S as desired
at the cost of increasing the number of states in the acceptor.

Suppose that the set of strings S is chosen from some finite-state
language L_0. It is desired to know what relationship the language
L(A(S,k)) may have to L_0 and under what conditions one can expect,
for example, equality between the languages. Some information can be
obtained by turning the problem around and asking the following question:
If we are given a finite-state language L_0, how do we construct a finite
set S and how do we set k in order to obtain L(A(S,k)) = L_0? The
answer is to consider the minimal deterministic automaton M which will
accept exactly L_0 and to construct S in such a way that A(S,k)
will be equivalent to M. If M has n states then we set the
look-ahead level k to equal n-2, since the states of M can be
characterized by their behavior n-2 steps into the future. This
analysis will now be formally carried out.

Definition 3.1. A finite-state deterministic automaton M is a
five-tuple <P, Σ, d, p_0, D> where

P is a finite nonempty set (of states)

Σ is a finite nonempty set (of input symbols)

d is a mapping P × Σ → P (the transition function)

p_0 ∈ P (the initial state)

D is a subset of P (the set of final states).

The function d is extended to a mapping P × Σ* → P as the
function f was above.

Definition 3.2. The language L(M) of the deterministic
automaton M will be defined as

L(M) = {w ∈ Σ* | d(p_0,w) ∈ D}.

Definition 3.3.

(1) A state p_i in a machine M will be called ℓ-reachable
if there is a string w of length ℓ or less such that d(p_0,w) = p_i.

(2) The states of machine M will be called k-distinguishable
if for each pair of distinct states p_i and p_j in M there is a
string w with length(w) ≤ k such that exactly one of the states
d(p_i,w) or d(p_j,w) is in D.

(3) Consider the set ℳ of machines M whose states are
all ℓ-reachable and k-distinguishable. Then Z(ℓ,k) will be defined
as follows:

Z(ℓ,k) = {L_0 | L_0 = L(M) and M ∈ ℳ}.
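
The two conditions that define Z(ℓ,k) can be tested by brute force on a small deterministic machine. The Python sketch below is our own illustration: the function names are hypothetical, the transition function d is a dictionary keyed by (state, symbol), and the three-state machine for "exactly one A" over {A,B,C} is written out by us as an assumed example. The enumeration is exponential in ℓ and k, so it is suitable only for toy cases.

```python
from itertools import product


def words(alphabet, max_len):
    """Yield every string over alphabet of length max_len or less."""
    for n in range(max_len + 1):
        for t in product(alphabet, repeat=n):
            yield "".join(t)


def run(delta, p, w):
    """Extended transition function d(p, w)."""
    for a in w:
        p = delta[(p, a)]
    return p


def all_reachable(states, alphabet, delta, start, l):
    """True if every state is d(start, w) for some w of length l or less."""
    return {run(delta, start, w) for w in words(alphabet, l)} == set(states)


def all_distinguishable(states, alphabet, delta, final, k):
    """True if every pair of states is separated by some w of length k or less."""
    tests = list(words(alphabet, k))
    signatures = {tuple(run(delta, p, w) in final for w in tests) for p in states}
    return len(signatures) == len(set(states))


# Assumed example: state 0 = no A seen, 1 = one A seen (final), 2 = two or more.
states, alphabet, final = [0, 1, 2], "ABC", {1}
delta = {(0, "A"): 1, (1, "A"): 2, (2, "A"): 2}
for p in states:
    for c in "BC":
        delta[(p, c)] = p
```

For this machine the states are 2-reachable and 1-distinguishable, so its language lies in Z(2,1); neither bound can be lowered.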

Definition 3.4. The symbol M will henceforth be used to
designate the minimal deterministic machine such that L(M) = L_0,
where L_0 is the language we are considering at the moment.

Definition 3.5. If w_i ∈ Σ* then w_0 • {w_i | i = 1,2,...,j} =
{w_0 w_i | i = 1,2,3,...,j}.

Theorem 3.3. If L_0 ∈ Z(ℓ,k) then L_0 = L(A(S,k)) if S is
constructed as follows:

(1) Choose strings z_1, z_2, ..., z_m such that for every state p
in M (where L(M) = L_0) there is a z_i = u_i v_i such that d(p_0,u_i) = p.

(2) S = ∪ u • g(u, L_0, k+1+σ_u), the union taken over all u ∈ Σ*
such that z_i = uv for some z_i,

where

σ_u = 1 if g(ua,L_0,k) = φ and g(ua,L_0,k+1) ≠ φ for some a ∈ Σ

    = 0 otherwise.

φ designates the empty set.

Proof. An informal justification will be included here, and a
more detailed proof will appear in the Appendix.

S is constructed so that each state p in M has a counterpart
in A(S,k), namely the k-tail g(u,L_0,k) where d(p_0,u) = p.
Furthermore, each successor to p, specifically p' = d(p,a) (a ∈ Σ),
must have a counterpart g(ua,L_0,k) in A(S,k) which is a successor
(under a) to g(u,L_0,k). To guarantee that A(S,k) will contain
g(u,L_0,k) and all of its successors, the set u • g(u,L_0,k+1) is
included in S. The set of u's is defined so that every state p in
M will have a counterpart in A(S,k) along with correctly assigned
successors. (For the moment, we ignore the quantity σ_u.)

The fact that A(S,k) will accept L_0 is clear because its
construction has insured its ability to simulate M. However, many
other states and transitions may appear in A(S,k), so one might think
A(S,k) would accept strings which are not in L_0. This will not
occur because the only other states which will appear in A(S,k) will
have the form g(u,L_0,k-i) with 0 < i ≤ k. The transitions will be
such that A(S,k) will be in state g(u,L_0,k-i) only when A(S,k)
is also in state g(u,L_0,k). (Remember that A(S,k) is nondeterministic.)
Therefore no strings will be accepted by A(S,k) which are not also
accepted by M.

The term σ_u arises in the case where one of the successors
g(ua,L_0,k) to a state g(u,L_0,k) is the empty set φ. Then A(S,k)
may not include the transition f(g(u,L_0,k),a) = g(ua,L_0,k) because
g(ua,L_0,k) may not have been created as a successor to g(u,L_0,k).
This problem is remedied by including u • g(u,L_0,k+1+σ_u) in S where
σ_u = 1. The reader is referred to the Appendix for a detailed proof
of the theorem.

Definition 3.6. A set S ⊆ L_0 which is constructed as described
by Theorem 3.3 will be called special (for language L_0 at level k).

Example 3.1. As an example, consider the automaton of Figure 3,
which accepts all of the strings on a three letter alphabet which have
exactly one A.

[Figure 3. Example 3.1.]

Only one string z_1 = AA is necessary to satisfy (1) of Theorem 3.3.
Letting u = Λ, u' = A, u" = AA, and k = 1, we obtain
S = u • g(u,L_0,k+1) ∪ u' • g(u',L_0,k+1) ∪ u" • g(u",L_0,k+1)
= {A, AB, AC, BA, CA} ∪ {ABB, ABC, ACB, ACC} ∪ { }. This results in the
machine A(S,1), which accepts exactly the desired language.
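
The claim of this example can be checked mechanically. The Python sketch below is ours, with hypothetical helper names and "" standing for Λ: it builds A(S,1) from the nine strings listed above and compares its behavior with the predicate "exactly one A" on every string over {A,B,C} of length 6 or less. The length limit is an arbitrary choice, so the check is evidence and not a proof.

```python
from itertools import product


def build_acceptor(S, k):
    """Return (initial states, transition table, final states) of A(S, k)."""
    heads = {s[:i] for s in S for i in range(len(s) + 1)}
    tail = {z: frozenset(s[len(z):] for s in S
                         if s.startswith(z) and len(s) - len(z) <= k)
            for z in heads}
    transitions = {}
    for z in heads:
        if z:
            transitions.setdefault((tail[z[:-1]], z[-1]), set()).add(tail[z])
    return {tail[""]}, transitions, {q for q in tail.values() if "" in q}


def accepts(machine, w):
    initial, transitions, final = machine
    current = set(initial)
    for a in w:
        current = {q2 for q in current for q2 in transitions.get((q, a), ())}
    return bool(current & final)


special = {"A", "AB", "AC", "BA", "CA", "ABB", "ABC", "ACB", "ACC"}
machine = build_acceptor(special, 1)

# Collect every short string on which the machine and the predicate disagree.
mismatches = []
for n in range(7):
    for t in product("ABC", repeat=n):
        w = "".join(t)
        if accepts(machine, w) != (w.count("A") == 1):
            mismatches.append(w)
```

The list mismatches comes out empty: the three states {A}, {Λ,B,C} and {Λ} of A(S,1) suffice to recognize the language.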

Theorem 3.3 is important because it indicates about how much
information must be obtained from L_0 before it can be recognized by
the algorithm. For most problems, Theorem 3.3 will, with appropriately
chosen z_i, give the S with the smallest possible number of strings
such that S will yield L_0. It is true that there are sets S
which yield L_0 and which are not special, but they are generally
larger. A more precise characterization of the sets which yield L_0
is possible, but it is too complex to be included here.

It will also be noted but not proved that if S is special for L_0
and S ⊆ S' ⊆ L_0, then L_0 = L(A(S,k)) ⊆ L(A(S',k)). The inclusion
of a special set in S' insures that states and transitions will appear
in A(S',k) which will accept all of the strings in L_0. The
additional strings in S' may add states and transitions to A(S',k)
which cause it to accept strings which are not in L_0. Thus a
randomly chosen subset of L_0 which contains a special set will yield
a language which contains L_0.

Two useful corollaries follow from Theorem 3.3.

Definition 3.7. S|_i = {w ∈ S | length(w) ≤ i}.

Corollary 3.1. If L_0 ∈ Z(ℓ,k) then L_0 = L(A(L_0|_i, k)) if
i ≥ ℓ+k+1+σ, where σ = 1 if the state q = { } is in A(L_0|_{i-1}, k)
and σ = 0 otherwise.

Proof. Choose the strings z_1, z_2, ..., z_m in (1) of Theorem 3.3
to be all the strings in L_0 which have length i-k-1-σ. Then S
as defined in (2) will be exactly L_0|_i.

Corollary 3.2. If L_0 = L(M) where M is an n-state automaton,
then L_0 = L(A(L_0|_i, n-2)) if i ≥ 2n-2+σ, where σ = 1 if the state
q = { } is in A(L_0|_{i-1}, n-2) and σ = 0 otherwise.

Proof. Note that the language for any n-state automaton is in
Z(n-1,n-2) and employ the previous corollary.

The lower bound on i in Corollary 3.2 if σ = 0 is the best
performance that could be hoped for under any conditions. That is,
there exist distinct n-state languages which are identical for all
strings of length 2n-3 or less, so that no algorithm could be expected
to discover L_0 with only that much information. The two n-state
machines of Figure 4 provide an example.

[Figure 4. Machines A_1 and A_2: L(A_1)|_{2n-3} = L(A_2)|_{2n-3} but L(A_1) ≠ L(A_2).]

Example 3.2. This section will be concluded with an example which
illustrates the various results given above. All of the strings of length
five or less for a particular finite-state language L_0 in Z(2,2) are
input to the algorithm while the look-ahead level is varied. The set of
strings S satisfies the requirements of Theorem 3.3 so that L_0 = L(A(S,2)).
The resulting machines are shown in Figure 5. The reader may check that
the various assertions made above do indeed hold.

[Figure 5. Machines for Example 3.2. Language labels recoverable from the
figure: [a(a+b)*b + b(a+b)(a+b)*b]*b and (ab*aa*b + b(a+b)b*aa*b)*b.]

4. Applications


The results of the previous sections are applicable to a variety
of problems in the synthesis of automata. The nondeterministic
automata created by these methods can, of course, be converted to
minimal deterministic automata by standard techniques [4, 5, 6, 8].

We will briefly sketch algorithms for some synthesis problems.

Suppose one is given two disjoint finite sets S_1 and S_2 and
asked to construct a machine which accepts S_1 and not S_2. The
required machine is the one of minimum k where y ∈ S_2 implies
y ∉ L(A(S_1,k)). This construction also solves the problem of
constructing a machine given both strings and non-strings of its
language.
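
A direct search for this minimum k can be sketched as follows. The code is our illustration with hypothetical names, "" standing for Λ; it relies on the fact that A(S_1,k) accepts exactly S_1 once k reaches the length of the longest string, so the search terminates whenever the two sets are disjoint.

```python
def build_acceptor(S, k):
    """Return (initial states, transition table, final states) of A(S, k)."""
    heads = {s[:i] for s in S for i in range(len(s) + 1)}
    tail = {z: frozenset(s[len(z):] for s in S
                         if s.startswith(z) and len(s) - len(z) <= k)
            for z in heads}
    transitions = {}
    for z in heads:
        if z:
            transitions.setdefault((tail[z[:-1]], z[-1]), set()).add(tail[z])
    return {tail[""]}, transitions, {q for q in tail.values() if "" in q}


def accepts(machine, w):
    initial, transitions, final = machine
    current = set(initial)
    for a in w:
        current = {q2 for q in current for q2 in transitions.get((q, a), ())}
    return bool(current & final)


def minimal_separating_k(S1, S2):
    """Smallest k such that A(S1, k) rejects every string of S2."""
    longest = max(len(s) for s in S1)
    for k in range(longest + 1):
        machine = build_acceptor(S1, k)
        if not any(accepts(machine, y) for y in S2):
            return k
    return None  # reached only when S1 and S2 share a string
```

For S_1 = {a, ab, abb} and S_2 = {abbb} the search stops at k = 2, since the machines for k = 0 and k = 1 both accept every string of the form ab...b.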

If one is given a bound n on the number of states of a finite
state machine and all of the strings of its language L_0 are available,
the minimal machine for L_0 can be found. One uses k = n-2 and
S = L_0|_{2n-1}, and computes A(S,k) as described above. By Corollary 3.2,
L(A(S,k)) = L_0; the minimized version of A(S,k) is the minimal machine
for the language L_0.

One can also solve the problem of finding the minimum machine for
distinguishing the disjoint infinite finite-state languages L_1 and L_2
when given bounds n_1, n_2 on the sizes of their respective machines.
The construction of the paragraph above gives minimal machines M_1, M_2
for L_1 and L_2 respectively. The smaller of these machines is an
upper bound on the size of the desired machine.

The most important use of our algorithms occurs in the sequential
learning situation. Suppose one is given a new string y_i at time t_i
and asked to select a machine A_i which describes the sequence up to
time t_i. This is the analog of the grammatical inference problem [2, 3]
which was our original motivation and is a model of scientific reasoning
and other hypothesis forming behaviour. The interesting questions in
sequential learning are the nature of the machines A_i and the limiting
behaviour of an algorithm as i → ∞. We have developed elsewhere [3]
a number of general results on this subject. These show that there is
an algorithm which will choose the best A_i at each i and will be such
that the successive A_i become ever better approximations to the machine
of L_0.

There are two important advantages of the methods of this paper
over the general algorithms described in [3]. The latter methods depend
on enumerating all finite-state machines in order, while the construction
of Section 2 requires one to consider only a small number of machines.
In addition, for fixed k, the machine A(S_i,k) is easily constructed
from the machine A(S_{i-1},k). This notion of sequential modification of
a synthesized machine is extremely important and will be briefly described.

Suppose that A(S_{i-1},k) has been constructed and that A(S_i,k) must
now be found; S_i = S_{i-1} ∪ {y_i}. Let y_i = a_1 a_2 a_3 ... a_r, a_j ∈ Σ.
Then only the k+1 (or fewer) states g(a_1 a_2 ... a_{r-k},S_{i-1},k),
g(a_1 a_2 ... a_{r-k+1},S_{i-1},k), ..., g(a_1 a_2 ... a_r,S_{i-1},k) in A(S_{i-1},k) are
affected by the addition of the string y_i. Each of the sets
g(a_1 a_2 ... a_{r-k+j},S_{i-1},k) must have the string a_{r-k+j+1} a_{r-k+j+2} ... a_r
added to it, and the transition function f must be correspondingly
changed. These alterations in A(S_{i-1},k) are enough to produce the
new machine A(S_i,k). Considering Example 2.1, if the string abba
is added to S, the new acceptor can be constructed by altering the
states g(abb,S,1) and g(abba,S,1). g(abb,S,1) = {Λ} becomes {Λ,a},
g(abba,S,1), which was not defined in the original construction, becomes {Λ},
and A(S ∪ {abba},1) now accepts exactly the set ab* + ab*ba.
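
The sequential modification can be sketched by keeping a table from each head z to its k-tail and touching only the heads of the new string. The Python below is our illustration and not the authors' program: the names are hypothetical, "" stands for Λ, heads of the new string that were not heads before enter the table with an empty k-tail, and for simplicity the transition table is rederived from the head table instead of being patched in place.

```python
def add_string(tails, y, k):
    """Update the table head -> k-tail in place for one new string y."""
    for cut in range(len(y) + 1):
        head, rest = y[:cut], y[cut:]
        tail = tails.setdefault(head, set())  # a new head starts with an empty k-tail
        if len(rest) <= k:                    # only the last k+1 heads gain a string
            tail.add(rest)


def machine_from_tails(tails):
    """Derive (initial states, transition table, final states) from the table."""
    state = {z: frozenset(t) for z, t in tails.items()}
    transitions = {}
    for z in state:
        if z:
            transitions.setdefault((state[z[:-1]], z[-1]), set()).add(state[z])
    return {state[""]}, transitions, {q for q in state.values() if "" in q}


def accepts(machine, w):
    initial, transitions, final = machine
    current = set(initial)
    for a in w:
        current = {q2 for q in current for q2 in transitions.get((q, a), ())}
    return bool(current & final)


tails = {}
for s in ("a", "ab", "abb"):
    add_string(tails, s, 1)
before = {z: set(t) for z, t in tails.items()}

add_string(tails, "abba", 1)
changed = sorted(z for z in tails if tails[z] != before.get(z))
machine = machine_from_tails(tails)
```

Only the entries for the heads abb and abba change when abba is added, and the resulting machine accepts strings such as aba and abbba in addition to a, ab, abb, abbb, and so on.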

We have shown that the construction of Definition 2.4 will produce
machines A(S_i,k) which have desirable properties. It remains to
describe an algorithm for choosing k and deciding which A(S_i,k) to
call the machine A_i. Suppose one is given an upper bound n on the
number of states required of M where L_0 = L(M). Consider the
following algorithm:

Algorithm 4.1. At each i, compute B_i = A(S_i|_{2n-1}, n-2). If
S_i ⊆ L(B_i) then A_i = B_i, otherwise A_i = A(S_i, n-2).
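
Algorithm 4.1 can be sketched in Python as follows. Everything here is our own illustration with hypothetical names; in particular the truncation length 2n-1 is an assumption drawn from Corollary 3.2 (it covers the bound 2n-2+σ for either value of σ), n is assumed to be at least 2, and the sample is assumed nonempty.

```python
from itertools import product


def build_acceptor(S, k):
    """Return (initial states, transition table, final states) of A(S, k)."""
    heads = {s[:i] for s in S for i in range(len(s) + 1)}
    tail = {z: frozenset(s[len(z):] for s in S
                         if s.startswith(z) and len(s) - len(z) <= k)
            for z in heads}
    transitions = {}
    for z in heads:
        if z:
            transitions.setdefault((tail[z[:-1]], z[-1]), set()).add(tail[z])
    return {tail[""]}, transitions, {q for q in tail.values() if "" in q}


def accepts(machine, w):
    initial, transitions, final = machine
    current = set(initial)
    for a in w:
        current = {q2 for q in current for q2 in transitions.get((q, a), ())}
    return bool(current & final)


def algorithm_4_1(S_i, n):
    """Choose A_i from the sample S_i, given a state bound n (assumed >= 2)."""
    k = n - 2
    truncated = {s for s in S_i if len(s) <= 2 * n - 1}  # assumed cut-off 2n-1
    if truncated:
        B_i = build_acceptor(truncated, k)
        if all(accepts(B_i, s) for s in S_i):
            return B_i
    return build_acceptor(S_i, k)  # fall back to the full sample


def words(alphabet, max_len):
    """Every string over alphabet of length max_len or less."""
    return ["".join(t) for n in range(max_len + 1)
            for t in product(alphabet, repeat=n)]


# Assumed test language: strings over {A,B,C} with exactly one A (n = 3).
full_sample = {w for w in words("ABC", 5) if w.count("A") == 1}
chosen = algorithm_4_1(full_sample, 3)
fallback = algorithm_4_1({"A", "BBBBBBA"}, 3)
```

With the full sample of strings of length 5 or less, the chosen machine agrees with "exactly one A" on longer strings as well; with the two-string sample the short part alone does not account for BBBBBBA, so the fallback machine A(S_i, n-2) is returned.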


Theorem 4.1. Algorithm 4.1 has the following properties:

a) S_i ⊆ L(A_i).

b) If S_i contains a special set, L_0 ⊆ L(A_i).

c) If L_0|_{2n-1} ⊆ S_j then for all i ≥ j, L(A_i) = L_0.

Proof. Part a) follows directly from Theorem 2.1, part b) from
the discussion following Theorem 3.3, part c) from Corollary 3.2.

If one assumes, as seems reasonable, that every string in L_0
will occur as some y_i, Algorithm 4.1 will eventually choose only a
machine which generates exactly L_0. This is known in the literature
as the algorithm identifying L_0. There is a problem in that Algorithm 4.1
depends on an a priori estimate of n, the size of a machine for L_0.
It is shown in [3] that without this estimate, no algorithm will be
able to identify the finite-state languages from an arbitrary presentation
of L_0. If, however, the presentation includes the information about
which strings are not in L_0 or meets certain regularity conditions,
there are algorithms like 4.1 which will identify the finite-state
languages.

Although  there  is  a  close  relation  between  the  problem  discussed 
here  and  the  grammatical  inference  problem  [2,  3],  the  criteria  for 
the  best  solution  to  the  problem  are  quite  different  and  this  leads 
to  a  number  of  other  differences  in  the  two  studies. 

5. Another Finite-State Acceptor

Another machine B(S,k) which accepts S at look-ahead level k
will be introduced here. The properties of B(S,k) are
not as easily characterized or as nice as those of A(S,k), but it does
handle certain problems better than A(S,k) and so will be briefly
discussed.

Definition 5.1. Assume that z ∈ Σ*, zy ∈ S for some y ∈ Σ*,
S ⊆ Σ*, and k is a nonnegative integer. m(z,S,k) will be defined
as the set of strings w ∈ Σ* with the properties

(1) zwx ∈ S for some x ∈ Σ* - {Λ}

(2) length(w) = k.

Definition 5.2. e(z,S,k) = m(z,S,k) ∪ g(z,S,k) • # where
{w_1, w_2, ..., w_j} • y is defined to be {w_1 y, w_2 y, ..., w_j y}.

e(z,S,k) is undefined if z and k are outside of the domains
specified in the definition of g(z,S,k).

* 

Definition 5.3. If S is a finite set of strings from Σ*,
let B(S,k) be the finite nondeterministic automaton

B(S,k) = <Q, Σ, f, Q_0, F>

where

Q = {q ∈ 2^(Σ* ∪ Σ*#) | there is a z with zw ∈ S and q = e(z,S,k)}

Σ = a finite nonempty set of input symbols

f(q,a) = {q' ∈ Q | there is a z ∈ Σ* such that
e(z,S,k) = q and e(za,S,k) = q'}

Q_0 = {e(Λ,S,k)}

F = {q ∈ Q | # ∈ q}.
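
The second construction differs from the first only in how a head z is mapped to a state, so both can share one builder. The Python sketch below is our illustration: the names are hypothetical, "" stands for Λ, the character "#" is assumed not to occur in the alphabet, and the final states are recognized by the presence of "" (for A) or "#" (for B) in the state.

```python
def a_state(z, S, k):
    """g(z, S, k): continuations of z of length k or less."""
    return frozenset(s[len(z):] for s in S
                     if s.startswith(z) and len(s) - len(z) <= k)


def b_state(z, S, k):
    """e(z, S, k): short continuations marked with '#', long ones cut to length k."""
    state = set()
    for s in S:
        if s.startswith(z):
            rest = s[len(z):]
            state.add(rest + "#" if len(rest) <= k else rest[:k])
    return frozenset(state)


def build(S, k, state_of, final_mark):
    """Shared builder; a state is final when it contains final_mark."""
    heads = {s[:i] for s in S for i in range(len(s) + 1)}
    state = {z: state_of(z, S, k) for z in heads}
    transitions = {}
    for z in heads:
        if z:
            transitions.setdefault((state[z[:-1]], z[-1]), set()).add(state[z])
    final = {q for q in state.values() if final_mark in q}
    return {state[""]}, transitions, final


def accepts(machine, w):
    initial, transitions, final = machine
    current = set(initial)
    for a in w:
        current = {q2 for q in current for q2 in transitions.get((q, a), ())}
    return bool(current & final)


sample = {"a", "ab", "abb"}
machine_a = build(sample, 1, a_state, "")
machine_b = build(sample, 1, b_state, "#")
```

On this sample with k = 1, machine_a accepts abbb while machine_b accepts only a, ab and abb, because the extra strings of length exactly k keep the states e(a,S,1) and e(ab,S,1) distinct.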

The  proofs  of  the  following  theorems  are  nearly  identical  to  their 
respective  counterparts  of  Sections  2  and  3  and  will  not  be  repeated  here. 


Theorem 5.1. S ⊆ L(B(S,k)) for all nonnegative k.

Theorem 5.2. L(B(S,k)) = S if k is greater than or equal
to the length of the longest string in S.

Theorem 5.3. L(B(S,k+1)) ⊆ L(B(S,k)).

There is no immediate analogy to Theorem 3.3 for B(S,k).

Since the states of B(S,k) contain more information than those
of A(S,k), it is not surprising to find the following true:

Theorem 5.4. L(B(S,k)) ⊆ L(A(S,k)).

Proof. w = a_1 a_2 ... a_j ∈ L(B(S,k)), a_i ∈ Σ, implies the existence
of states q_0, q_1, ..., q_j in B(S,k) with the properties (1), (2),
and (3) of Definition 2.2. This implies the existence of states
q'_0, q'_1, q'_2, ..., q'_j in A(S,k) which also satisfy (1), (2), and (3).
Let q'_i = {w | w# ∈ q_i} for i = 0,1,2,...,j. Now q_{i+1} ∈ f(q_i, a_{i+1})
in B(S,k) implies there is a z ∈ Σ* such that e(z,S,k) = q_i and
e(z a_{i+1},S,k) = q_{i+1}. From the definition of e, this implies that
g(z,S,k) and g(z a_{i+1},S,k) exist as states in A(S,k), and we have
just named these states q'_i and q'_{i+1}, respectively. This leads to
q'_{i+1} ∈ f(q'_i, a_{i+1}) in A(S,k), which proves (2) of Definition 2.2 for
the states q'_0, q'_1, ..., q'_j. (1) and (3) are easy to check and so it
follows that w ∈ L(A(S,k)).

Thus B(S,k) accepts S and it accepts fewer "extra" strings
than A(S,k). This effect is extreme in certain problems and may be
considered quite an advantage. However, it has not been shown that
there is any subset S of a language which will yield the language
using B(S,k) as was shown for A(S,k), so the earlier machine may
be preferred.

6.  Discussion  and  Summary 

This  paper  gives  two  algorithms  for  constructing  acceptors  from 
finite  sets  of  strings.  Both  algorithms  have  been  programmed  on  a 
computer  and  extensively  tested.  Typical  constructions  of  machines 
with  ten  or  twenty  states  take  a  few  seconds  or  less  to  complete. 

Many  other  versions  of  these  algorithms  are  possible,  and  some  were 
investigated  although  they  are  not  described  here. 

The  authors  are  not  aware  of  a  comparable  solution  to  this  problem 
elsewhere  in  the  literature.  The  algorithm  has  a  simple  operation,  and 
is  therefore  easy  to  program  and  fast  in  execution.  The  parameter  k 
enables  the  user  to  obtain  as  exact  a  fit  to  the  needed  behavior  as  he 
desires  at  the  cost  of  increasing  the  complexity  of  the  resulting  acceptor. 
The  simplicity  of  the  algorithm  makes  its  operation  easy  to  understand 
and  easy  to  characterize.  Finally,  the  system  has  the  distinct  advantage 
that  if  a  large  amount  of  computational  effort  is  invested  in  finding  an 
acceptor  for  a  set  of  strings,  changes  can  be  made  in  its  behavior  without 
the  necessity  of  starting  the  design  procedure  over  again.  If  A(S,k)  is 
created  to  accept  S  and  then  S  is  changed  slightly,  only  the  states  and 
transitions  in  A(S,k)  which  correspond  to  the  changes  in  S  need  to  be 
adjusted  to  obtain  a  new  acceptor. 
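
The effect of the parameter k can be illustrated with a minimal sketch. It assumes the definition of Section 2, in which the state reached by a prefix z is the set g(z,S,k) of tails w with zw in S and length(w) at most k; the function names k_tails, build_acceptor, and accepts are hypothetical and are not taken from the authors' program.

```python
from itertools import chain

def k_tails(z, S, k):
    """g(z,S,k): tails w with z+w in S and len(w) <= k (assumed definition)."""
    return frozenset(s[len(z):] for s in S
                     if s.startswith(z) and len(s) - len(z) <= k)

def build_acceptor(S, k):
    """Return (start state, transition map, final states) built from S."""
    prefixes = {s[:i] for s in S for i in range(len(s) + 1)}
    state_of = {z: k_tails(z, S, k) for z in prefixes}
    delta = {}  # (state, symbol) -> set of successor states
    for z in prefixes:
        for s in S:
            if s.startswith(z) and len(s) > len(z):
                a = s[len(z)]
                delta.setdefault((state_of[z], a), set()).add(state_of[z + a])
    # A state is final when it contains the empty tail.
    finals = {q for q in state_of.values() if "" in q}
    return state_of[""], delta, finals

def accepts(machine, w):
    """Nondeterministic run: track every state reachable on w."""
    start, delta, finals = machine
    current = {start}
    for a in w:
        current = set(chain.from_iterable(delta.get((q, a), ())
                                          for q in current))
    return bool(current & finals)

S = {"ab", "abab", "ababab"}
print(accepts(build_acceptor(S, 2), "abababab"))  # True: small k generalizes
print(accepts(build_acceptor(S, 6), "abababab"))  # False: large k gives exactly S
```

With k = 2 the prefixes a, aba, and ababa share one state, so the machine accepts every string of the form (ab)^n; with k = 6, the length of the longest string in S, every prefix has its own state and only S is accepted.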

Appendix


A more detailed proof of Theorem 3.3 will be included here.

Definition A1. If p_i is a state in M with the property that
d(p_i,w) ∉ D for all w ∈ Σ* , then p_i will be called an absorbing state.

Lemma A1. L(A(S,k)) = L_0 if and only if for each state p_i ∈ P
of M there is a set {x_{i,1}, x_{i,2}, ..., x_{i,j_i}} of distinct sets x_{i,j}
of states q ∈ Q of A(S,k) such that

(1) d(p_0,w) = p_i if and only if f(q_0,w) ∈ {x_{i,1}, x_{i,2}, ..., x_{i,j_i}} ,
where w ∈ Σ* , {q_0} = Q_0 , unless p_i is an absorbing state.
d(p_0,w) = absorbing state if and only if f(q_0,w) = ∅ .

(2) If p_i ∈ D then x_{i,j} ∩ F ≠ ∅ for j = 1,2,...,j_i .

(3) If p_i ∉ D then x_{i,j} ∩ F = ∅ for j = 1,2,...,j_i .

Proof. Assume that the three conditions hold and observe why
it follows that L(A(S,k)) = L_0 . Construct the deterministic automaton
N = <X, Σ, e, x_0, E> which has the same behavior as A(S,k) . [6]

X ⊆ 2^Q

e(x,a) = x' if x' = ∪ f(q,a) , the union being taken over all q ∈ x

x_0 = Q_0

E = {x | x ∩ F ≠ ∅}
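
The construction of the deterministic automaton can be sketched as follows. This is only an illustration of the standard subset construction under the four definitions just given; the names determinize and run, and the dictionary encoding of f, are assumptions made for the example and do not come from the paper.

```python
def determinize(alphabet, f, Q0, F):
    """Subset construction.

    f maps (state, symbol) to a set of successor states; missing keys
    mean "no transition". Returns (X, e, x0, E) as in the text.
    """
    x0 = frozenset(Q0)
    X, e, todo = {x0}, {}, [x0]
    while todo:
        x = todo.pop()
        for a in alphabet:
            # e(x,a) is the union of f(q,a) over all q in x.
            x_next = frozenset(q2 for q in x for q2 in f.get((q, a), ()))
            e[(x, a)] = x_next
            if x_next not in X:
                X.add(x_next)
                todo.append(x_next)
    # A subset is final when it contains at least one final state.
    E = {x for x in X if x & set(F)}
    return X, e, x0, E

def run(e, x0, w):
    """Follow the deterministic transitions of e on the string w."""
    x = x0
    for a in w:
        x = e[(x, a)]
    return x

# A small hypothetical nondeterministic machine over {a, b}.
f = {("A", "a"): {"B"}, ("B", "b"): {"C", "D"}, ("C", "a"): {"B"}}
X, e, x0, E = determinize("ab", f, {"A"}, {"C", "D"})
print(run(e, x0, "abab") in E)
```

Only subsets reachable from x_0 are generated, so X is a subset of 2^Q as required. The empty subset, once entered, is never left and is never final; it plays the role of the absorbing state in condition (1) of the Lemma.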

The three conditions partition the states of N into sets Y_i of
states which are equivalent to p_i in M . They require by (1) that N
be in a state of Y_i if and only if M is in p_i , and by (2) and (3)
that N is in a final state if and only if M is in one. So M and N
are equivalent by conditions (1), (2), and (3), and N and A are
equivalent by construction. Therefore, L(A) = L(M) = L_0 .

If L(A(S,k)) = L_0 , then N can be defined as above and the
sets Y_i = {x ∈ X | x is equivalent to p_i ∈ P} can be constructed. Certainly
d(p_0,w) = p_i if and only if e(x_0,w) = x for one of the x ∈ Y_i , since N
must go into a state which is equivalent to p_i . So property (1) holds.
p_i ∈ D implies x ∈ E , which means x ∩ F ≠ ∅ . So property (2) holds and
property (3) holds similarly.

Proof of Theorem 3.3. We construct A(S,k) and show that it
satisfies the conditions of Lemma A1. Corresponding to state p_i ∈ P
of M we construct the sets x_{i,j} , 1 ≤ j ≤ j_i , which each have the
state g(u_l,L_0,k) where d(p_0,u_l) = p_i and u_l v_l = z_l is one of the
strings designated in (1). (If p_i is an absorbing state then construct
only the set x_{i,1} = { } .) Each of the sets x_{i,j} , 1 ≤ j ≤ j_i , also
will contain either none or some of the states g(u_l,L_0,k-h) for
1 ≤ h ≤ k , where a set x_{i,j} is constructed if there is a y ∈ Σ* such
that f(q_0,y) = x_{i,j} . The set x_{i,j} will contain nothing else.
Certainly this construction can be done since by the construction of S
there must be a g(u_l,L_0,k) as defined for each p_i ∈ P . Furthermore,
this construction partitions all of the sets of states in A(S,k) since
no states can exist which are not of the form g(u_l,L_0,k-h) for
0 ≤ h ≤ k . This can be checked by studying the construction of S .

It will be necessary to use the following property which can be
easily proved:

Property A. If d(p_0,y_1) = d(p_0,y_2) , then
g(y_1,L_0,k) = g(y_2,L_0,k) .

Condition (1) of Lemma 3.2 will be proved by induction on the
length of w .

I. If length(w) = 0 then d(p_0,w) = p_0 and f(q_0,w) = Q_0 =
{g(Λ,L_0,k)} ∈ {x_{0,1}, x_{0,2}, ..., x_{0,j_0}} .

II. Assume length(w) = h , length(wa) = h+1 , and condition (1)
holds for length(w) = h . d(p_0,wa) = p_i if and only if there is
a p_j such that d(p_0,w) = p_j and d(p_j,a) = p_i . This is true if and
only if f(q_0,w) ∈ {x_{j,1}, x_{j,2}, ..., x_{j,j_j}} and d(p_j,a) = p_i by the
induction hypothesis. It remains to be shown that the last stated
conditions hold if and only if f(q_0,wa) ∈ {x_{i,1}, x_{i,2}, ..., x_{i,j_i}} .

That this is true can be seen by examining the states of a set
x ∈ {x_{j,1}, x_{j,2}, ..., x_{j,j_j}} and observing what happens as the input
symbol a is applied. First of all we know that g(u,L_0,k) ∈ x where
d(p_0,u) = p_j by the construction of x . Since the set
S_1 = u · g(u,L_0,k+1) is a subset of S and S_2 = ua · g(ua,L_0,k) is a
subset of S because S_2 ⊆ S_1 , we have g(ua,L_0,k) ∈ f(g(u,L_0,k),a) by
definition of f . (One case which deserves special comment is when
g(ua,L_0,k) = ∅ . This is dealt with in the next paragraph.) But
d(p_0,ua) = p_i so the newly found state g(ua,L_0,k) is exactly the state
which was incorporated into all of x_{i,1}, x_{i,2}, ..., x_{i,j_i} . This
assertion uses Property A and the fact that g(u',L_0,k) was included in
every set x_{i,1}, x_{i,2}, ..., x_{i,j_i} where d(p_0,u') = p_i . Similarly the
states obtained by computing f(q,a) for all other q in x can be
examined and shown to be of the form g(ua,L_0,k-r-1) if
q = g(u,L_0,k-r) . It is left to the reader to fill in the remaining
details and thus verify that
f(q_0,wa) ∈ {x_{i,1}, x_{i,2}, ..., x_{i,j_i}} .

Referring to the previous paragraph, if g(ua,L_0,k) = ∅ then one
of the two cases will follow:

Case I. g(ua,L_0,k+1) ≠ ∅ . If g(ua,L_0,k) = ∅ and uaw ∉ S for
any w ∈ Σ* , then g(ua,S,k) will be undefined and the proof will fail
at this point. This is why it is necessary to have the term σ_u included
in the construction of S . If g(ua,L_0,k) = ∅ and g(ua,L_0,k+1) ≠ ∅
then σ_u = 1 and u · g(u,L_0,k+2) ⊆ S . Then ua · g(ua,L_0,k+1) is a
subset of S and is not empty. Therefore there will be a w such that
uaw ∈ S so that g(ua,S,k) will be defined and the proof will go through.

Case II. g(ua,L_0,k+1) = ∅ . This occurs if p_i is an absorbing
state. This is true if and only if g(ua,L_0,k) is not defined, which
happens if and only if f(g(u,L_0,k),a) is empty. So (1) of the Lemma
follows in this case as well.

Conditions (2) and (3) of the Lemma are clearly true. If p_i ∈ D
then Λ ∈ g(u,L_0,k) where d(p_0,u) = p_i . Then g(u,L_0,k) ∈ F so that
x_{i,j} ∩ F ≠ ∅ for all j = 1,2,3,...,j_i . If p_i ∉ D then
Λ ∉ g(u,L_0,b) for any nonnegative integer b so that x_{i,j} ∩ F = ∅
for all j = 1,2,3,...,j_i . This completes the proof of the Theorem.


